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ABSTRACT 

The radio astronomy community is currently building a 
number of phased array telescopes. The calibration of 
these telescopes is hampered by the fact that covariances 
of signals from closely spaced antennas are sensitive to 
noise coupling and to variations in sky brightness on 
large spatial scales. These effects are difficult and com- 
putationally expensive to model. We propose to model 
them phenomenologically using a non-diagonal noise co- 
variance matrix. The parameters can be estimated us- 
ing a weighted alternating least squares (WALS) algo- 
rithm iterating between the calibration parameters and 
the additive nuisance parameters. We demonstrate the 
effectiveness of our method using data from the low fre- 
quency array (LOFAR) prototype station. 

1. INTRODUCTION 

The radio astronomical community is currently con- 
structing a number of large scale phased array telescopes 
such as the low-frequency array (LOFAR) [I] and the 
Murchison wide field array (MWA) [2|. These instru- 
ments need to be calibrated regularly to track varia- 
tions in the electronics of the antennas and receivers, 
as well as direction dependent variations of the iono- 
sphere [3 a . E.g., LOFAR will consist of order 50 sta- 
tions (distributed over an area of a hundred kilome- 
ters or more), where each station consists of 96 "low 
band" dual-polarized dipole antennas (10-90 MHz) and 
96 "high band" antennas (110-240 MHz). The latter 
antennas are in turn composed of 16 beamformed dual- 
polarized droopy dipole antennas. Each station provides 
a number of beamformed outputs, which in turn are cor- 
related at a central location to form images and other 
astronomy products. 

In this paper we focus on the calibration of the sta- 
tion antennas. The general aim is to estimate the direc- 
tion independent gains and phases of each sensor, as well 
as the direction dependent gains corresponding to each 
source. This is done for the 2-10 brightest sources in 
the sky, assuming a point source model. The problem 
is complicated by the fact that the covariances of sig- 
nals from closely spaced antennas within a station are 
sensitive to noise coupling (for the lowest frequencies, 
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the antennas are spaced closer than half a wavelength) . 
Also, the point source model does not entirely hold be- 
cause of bright emission from the plane of the galaxy, 
extending over the entire sky. Fortunately, this emission 
is spatially smooth, which implies that it is dominant on 
the short spatial scales in the array aperture, i.e. on the 
short baselines. In this paper we propose to model both 
short baseline effects by an additive noise covariance ma- 
trix, which in this case is not diagonal and has unknown 
entries for each short baseline. If we can estimate this 
matrix, a simple point source model will be sufficient 
to calibrate the array, which reduces the problem to a 
problem for which solutions are readily available [IHS]. 

Direction finding problems for calibrated arrays in 
the presence of unknown correlated noise have been 
extensively studied in the 1990s. It was proven that 
the general problem is not tractable without imposing 
some appropriate constraints on the noise covariance 
matrix or exploiting differences in temporal character- 
istics between source and noise signals [7] . Radio astro- 
nomical signals generally behave like noise, thus tempo- 
ral techniques (instrumental variables) are not applica- 
ble. Instead, we should rely on an appropriately con- 
strained parameterized model of the noise covariance 
matrix. Starting with [8], a series of papers were pub- 
lished; see [9] for an overview. ML estimators for the 
source and instrument parameters under a generalized 
noise covariance parameterization is provided in [9lll0j. 
whereas nonlinear least squares estimators were stud- 
ied in [T0rfT2] . In either case, an analytic source and 
instrument parameter dependent solution is derived for 
the noise model parameters which is substituted back 
into the cost function. This cost function then has to be 
minimized using a generalized solving technique, such 
as Newton iterations. This approach works well if the 
number of instrument and source parameters is small. 
For larger problems (we consider 100 antenna/source pa- 
rameters and over 750 noise covariance parameters), it 
is convenient to exploit suboptimal but closed-form an- 
alytic solutions, at least for initialization. We therefore 
propose a weighted alternating least squares (WALS) 
approach which iterates over noise, source and instru- 
ment parameters. The proposed method can thus be 
regarded as an extension to the methods proposed in [6] 

Notation: The transpose operator is denoted by 
T , the complex conjugate (Hermitian) transpose by , 



complex conjugation by (•) and the pseudo- inverse by 

t. An estimated value is denoted by (•). ® denotes the 
Kroncckcr product and o is used to denote the Khatri- 
Rao or column-wise Kronecker product of two matri- 
ces. vec(-) converts a matrix to a vector by stacking the 
columns of the matrix. 

2. DATA MODEL AND PROBLEM 
STATEMENT 

We consider an array of P antennas. The measured 
P x P array covariance matrix can be modeled as 



R = Ro (0) + E, 



(1) 



where Ro (0) is the signal model for an ideal noise free 
array, which depends on a number of unknown real val- 
ued parameters accumulated in a column vector 0, and 
E ra is a P x P matrix describing the noise corruption. 
Note that S„ must be Hcrmitian, since the array covari- 
ance matrix is a Hermitian matrix. In our application, 
the data covariance model is 



R (0) = GiAG 2 S s Gf A^Gf 



(2) 



where S s is the covariance of the point sources (assumed 
to be known from tables), A contains the direction 
vectors, Gi a (diagonal) instrumental gain/phase ma- 
trix, and G2 a (diagonal) direction dependent gain ma- 
trix [6J . If the direction dependent gains are unknown, 
we can introduce E = G2E S G^, which implies that we 
should estimate the apparent source powers. The direc- 
tion dependent gains follow directly from the apparent 
source powers if the actual source powers are known, 
e.g. from tables. If the source positions are unknown 
or perturbed by propagation conditions, A may be pa- 
rameterized [BJ. The contents of the parameter vector 
therefore strongly depend on the available knowledge 
on the instrument, the propagation conditions and the 
sources. 

As in 8 and subsequent papers, the unknown noise 
covariance matrix is modeled as a linear sum of known 
matrices, in this case simple selection matrices E^ which 
are zero everywhere except for a '1' in entry 



E n 



The set S contains the index pairs of the short base- 
lines, including the autocorrelation entries The 
unknown coefficients <7jj are the nuisance parameters. 

In the absence of the noise corruptions, i.e., R 
Hi (0), the estimation problem to find the parameter 
vector is commonly formulated either as a ML prob- 
lem, or as a generalized least squares estimation problem 



9 



argmm 
9 



w(R-R o (0))w 



(3) 



where R is the measured array covariance matrix. It is 
known that with W = R -1 / 2 this estimator is asymp- 
totically unbiased and asymptotically efficient [TOlfT^] . 
Indeed, the simulations in [10 show only a small im- 
provement of the ML solution as compared to the WLS 



solution. For Ro(0) given in @, this problem, as well 
as the case with an unknown diagonal noise covariance, 
was studied by us in [BJ, in the present paper, we will 
assume that a solution to this problem is available. 

In the presence of correlated noise on the short base- 
lines, the problem is extended to 



argmm 
0,<x„ 



W R - Ro (0) - E„ W 



(4) 

where <r n is a vector containing all unique real valued 
parameters required to describe the nonzero entries of 
E„. This vector can be related to E„ using a selection 
matrix I s such that vec(E„) = l s er n . By choosing the 
selection matrix appropriately, we can ensure that the 
estimated E n is Hermitian. 

3. PARAMETER ESTIMATION 
3.1 Weighted Alternating Least Squares 

We propose to solve the problem in Eq. ([4} by alternat- 
ing between weighted least squares (WLS) estimation 
of the desired parameters and WLS estimation of the 
nuisance parameters <r n . The first WLS problem can be 
formulated as 







argmm 




W R-E„ -R o (0) W 



(5) 



which is identical in form to Eq. ([3]) for which a solution 
is assumed to be available. 

The second WLS problem can be formulated as 



argmm 
argmin 

(T „ 



W^R-Ro (0) 

(W <8> W) vec (R - R (0 



W 



2 
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(G) 



The solution to this problem is given by 
5n = ((W<8>W) I s ) f (W®W)vec (r-Rq (0) 



(7) 

In section [2] it was mentioned that W = R 1 ' 2 pro- 
vides optimal weighting for the LS cost function. Since 

R is not known, R is used instead in many applications. 
It can be shown that this may lead to a bias in the es- 
timate of <x n for a finite number of samples [BJ. This 
bias can be avoided by using the best available model 

R ^0, CT n J instead of R. 

Estimation of receiver noise powers is a special case 
of the general problem treated here. In this case E„ is 
a diagonal, and I s = I o I where I is the P x P identity 
matrix. This form of selection matrix simplifies Eq. (J7J 
considerably [BJ. In some problems, for example in es- 
timating the receiver based gains, it is then possible to 
simply ignore the diagonal entries instead of including 
nuisance parameters [5]. It can further be shown that 
ignoring the corrupted entries instead of including them 



using nuisance parameters does not change the Cramer- 
Rao bound of the parameters of interest [13] . This can 
be explained intuitively by regarding the matrix equa- 
tion describing the WLS problem as a set of scalar equa- 
tions. If a unique nuisance parameter is added to one of 
those scalar equations, that equation is required to solve 
for the nuisance parameter and can thus not be used to 
solve any other parameters. This implies that this equa- 
tion could have been ignored if one would only focus on 
the parameters of interest. In practice, however, it may 
be hard to develop an algorithm that ignores the cor- 
rupted entries in a statistically efficient way. 

3.2 Algorithm 

The resulting algorithm is as follows: 

1. Initialization Set the iteration counter i — 1 and ini- 
tialize a^} based on any prior information if avail- 



able, otherwise initialize <x 
W = R- 1 /2. 



[0] 



to zero. Initially use 



>1 



by solving the WLS problem formu- 
([5]) using <x[^ 1 ' as prior knowledge. 



2. Estimate 
lated in Eq 

3. Estimate a$ using Eq. ([7]) using ' as prior knowl- 
edge. 

4. Update W = R -1 / 2 to avoid the bias mentioned in 
the previous section. 

5. Check for convergence, otherwise continue with step 
2. 

An algorithm that altcrnatingly optimizes for dis- 
tinct groups of parameters, in our case 6 and <x n , can 
be proven to converge if the value of the cost function 
decreases in each iteration. We assume that a suitable 
method is available to find 8. Since we propose to es- 
timate <r„ using the well known standard solution for 
least squares estimation problems, the value of the cost 
function will decrease in both steps, thus ensuring con- 
vergence. Although there is no guarantee that the al- 
gorithm will converge to the global optimum, practical 
experience with LOFAR and results from Monte Carlo 
simulations in this paper and in earlier papers 0[6] in- 
dicate that the proposed method produces good results 
for most reasonable initial estimates. 

4. EXPERIMENTAL RESULTS 

4.1 The LOFAR prototype station 

The first full-scale LOFAR prototype station with real- 
time backend became operational in the second quar- 
ter of 2006 P3]. This station consisted of 48 dual- 
polarization antennas operating between 10 and 90 MHz 
arranged in a randomized configuration based on rings 
with exponentially increasing radii as shown in Fig. [TJ 
Each of the two signals from every antenna was filtered 
and digitized using a 12-bit 200 MHz ADC. A real-time 
FPGA based digital processing backend splits the 100 
MHz Nyquist sampled base band in 512 subbands, each 
195 kHz wide, using a polyphase filter. The backend 
also provides a correlator which can correlate in real- 
time the data from the 96 input channels for a single 
subband. This subband may be any of the 512 available 
subbands and this choice may change every second. 
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Figure 1: Array configuration of the 48 antenna LOFAR 
prototype station. 
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Figure 2: Calibrated all-sky map for a single polariza- 
tion at 50 MHz from the 48-element LOFAR prototype 
station. The image shows the sky projected on the hori- 
zon plane of the station. 



4.2 Results 

For the demonstration in this paper we used data from a 
single 1 s snapshot observation in the 195 kHz subband 
centered at 50 MHz. This observation was done on 14 
February 2008 at 1:42:07 UTC. We will calibrate the 
data using the method described in Sec. 13.21 where we 
will model correlated noise terms on all baselines shorter 
than four wavelengths. The complete WLS problem 
thus implies estimation of the amplitudes (48 param- 
eters) and phases (47 parameters) of the antenna based 
complex gains, the source power ratio of the two bright- 
est sources (1 parameter) and 764 real valued nuisance 
parameters describing all non-zero entries of the noise 
covariance matrix for a total of 860 free real valued pa- 
rameters per polarization. In this experiment we show 
that use of such nuisance parameters can reduce the 
complex source structure on the sky to a simple model 
with just two point sources. 
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Figure 3: Calibrated all-sky map of extended emission 
observed at baselines shorter than four wavelengths for 
a single polarization at 50 MHz. 



Figure [5] shows a calibrated all sky map for a sin- 
gle polarization. There are two bright point sources 
near the northeastern horizon. The image also shows 
a lot of extended emission from the galactic plane (on 
the northwestern horizon) and the north polar spur (on 
the eastern horizon). This extended emission is hard 
to model accurately, but only affects the short baselines 
since short distances in the aperture plane of a phased 
array correspond to low spatial frequencies, which de- 
scribe the structure on large spatial scales. 

It was found that most of this extended emission is 
captured by the correlations on baselines shorter than 
four wavelengths. This affects 358 crosscorrelations and 
the 48 autocorrelations resulting in the aforementioned 
764 real valued parameters to describe the non-zero en- 
tries of S n . Using the procedure outlined in the pre- 
vious section, <x n was estimated simultaneously with 
containing the other parameters. S„ can therefore be 
interpreted as an estimate of the extended source struc- 
ture, noise coupling and receiver noise powers. This 
is nicely demonstrated in Fig. [3] which shows an image 
based on S n after the calibration was completed. 

Figure 0] shows the difference between the maps 
shown in Figs. [5] and [3] This shows that S„ provides a 
description of the extended emission that is sufficiently 
accurate to reduce R— S n to an array covariance matrix 
that can be described by a model consisting of only two 
point sources. This thus reduces our original problem 
to one that has been discussed extensively in the array 
signal processing literature. 

5. IMPROVING THE COMPUTATIONAL 
EFFICIENCY 

The calculation of <r„ using Eq. [7] forms the most expen- 
sive part of the algorithm in terms of CPU and memory 
usage due to the Kronecker products. These Kronecker 
products can only be reduced to simpler Khatri-Rao or 
Hadamard products in a number of special cases, such as 
a diagonal noise covariance matrix treated in [5]. How- 
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Figure 4: Difference between the calibrated all-sky map 
shown in Fig. [2] and the contribution of extended emis- 
sion shown in Fig. [3] showing that the remainder can be 
accurately modeled using a model with just two point 
sources. 
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Table 1: Source powers and source locations used in the 
simulations 

ever, the parameterization of S n chosen here implies 
that each entry of the array covariance matrix with a 
contribution from E„ is affected by a unique additive 
parameter. Intuitively, one would therefore expect that 
the weighting in Eq. ([7J would not make much differ- 
ence. Omitting this would reduce the CPU and memory 
requirements considerably, since the Kronecker products 
and the inverse increase the numerical complexity from 
o (7VP 2 ) to o (N 3 + P 6 ) and the size of the largest ma- 
trix from P 2 x N to P 2 x P 2 , where N is the number 
of noise parameters stacked in cr n . 

This idea was therefore tested in Monte Carlo sim- 
ulations. For these simulations, a five armed array was 
defined, each arm being an eight-element, one wave- 
length spaced ULA. The first element of each arm 
formed an equally spaced circular array with half wave- 
length spacing between the elements. The source model 
is presented in Table [T] This source model was gener- 
ated with a random number generator to verify that the 
proposed approach works for arbitrary source models. 

Figure [5] compares the variance on the estimates for 
the omnidirectional complex gains of the receiving ele- 
ments and the source powers obtained after 100 runs of a 
Monte Carlo simulation with weighted LS estimation of 
<r„ and the computationally more efficient unweighted 
LS estimation of tr n . The complex receiving element 
gains and the source powers, stacked in 6, were esti- 
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Figure 5: Comparison of the variance on the direc- 
tion independent complex gain and the source power 
estimates obtained in Monte Carlo simulations with 
weighted LS estimation of er„ and unweighted LS es- 
timation of cr n . 



mated using weighted least squares in both cases. Both 
algorithms generally converged within three iterations 
to relative error per parameter of ~ 10~ 4 , i.e. well below 
the Cramer-Rao bound. The convergence rate in these 
simulations was one digit per iteration down to the nu- 
merical accuracy provided by double precision floating 
point numbers. 

The results indicate that the variance on these esti- 
mates is the same in both cases within the accuracy 
provided by the simulations. We therefore conclude 
that it is viable to discard the weighting in Eq. (|7|). 
With this modification all 860 free parameters in the 
experiment described in the previous section could be 
extracted from the actual data using Matlab running 
on a standard dual core 2.4 GHz CPU in only 0.4 sec- 
onds. This implies that a single 2.4 GHz core can keep 
up with the data from the correlator at the LOFAR 
station, which real-time correlates the antenna signals 
for a single subband with one second integration time. 
This update rate is required to track variations in the 
electronic gains and the ionosphere. 

6. CONCLUSIONS 

We have demonstrated using data from a LOFAR proto- 
type station that the effects of noise coupling, receiver 
noise powers and extended emission on a radio astro- 
nomical phased array can be phenomenologically de- 
scribed by a non-diagonal noise covariance matrix with 
non-zero entries on short baselines. These entries can be 
computationally efficient and accurately estimated by a 
WALS algorithm alternating between estimation of the 
correlated noise parameters and calibration parameters. 
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